Intelligent Paper Notes

Development of a Thermodynamics of Human Cognition and Human Culture

Diederik Aerts , Jonito Aerts Arguëlles , Lester Beltran , Sandro Sozzo

Category: Natural Language Processing

2022-12-24

Inspired by foundational studies in classical and quantum physics, and by information retrieval studies in quantum information theory, we have recently proved that the notions of 'energy' and 'entropy' can be consistently introduced in human language and, more generally, in human culture. More explicitly, if energy is attributed to words according to their frequency of appearance in a text, then the ensuing energy levels are distributed non-classically, namely, they obey Bose-Einstein, rather than Maxwell-Boltzmann, statistics, as a consequence of the genuinely 'quantum indistinguishability' of the words that appear in the text. Secondly, the 'quantum entanglement' due to the way meaning is carried by a text reduces the (von Neumann) entropy of the words that appear in the text, a behaviour which cannot be explained within classical (thermodynamic or information) entropy. We claim here that this 'quantum-type behaviour is valid in general in human cognition', namely, any text is conceptually more concrete than the words composing it, which entails that the entropy of the overall text decreases. This result can be prolonged to human culture and its collaborative entities having lower entropy than their constituent elements. We use these findings to propose the development of a new 'non-classical thermodynamic theory for human cognition and human culture', which bridges concepts and quantum entities and agrees with some recent findings on the conceptual, not physical, nature of quantum entities.
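
The entropy claim in this abstract can be illustrated with a standard textbook case rather than with the authors' text data: for an entangled pure state, the von Neumann entropy of the whole is lower than that of a part. The sketch below assumes a maximally entangled two-qubit state as a stand-in for a "text" made of two "words"; the state, the function name `von_neumann_entropy`, and the use of NumPy are illustrative choices, not taken from the paper.

```python
import numpy as np


def von_neumann_entropy(rho):
    """Return S(rho) = -Tr(rho ln rho) in nats for a density matrix rho."""
    eigvals = np.linalg.eigvalsh(rho)
    eigvals = eigvals[eigvals > 1e-12]  # drop zeros: 0 * ln 0 is taken as 0
    return float(-np.sum(eigvals * np.log(eigvals)))


# Assumed toy state: (|00> + |11>) / sqrt(2), a maximally entangled pair.
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_total = np.outer(psi, psi.conj())

# Reduced state of the first subsystem: partial trace over the second.
rho_4d = rho_total.reshape(2, 2, 2, 2)      # indices (a, b, a', b')
rho_first = np.einsum("abcb->ac", rho_4d)   # sum over b = b'

entropy_total = von_neumann_entropy(rho_total)  # pure state: approximately 0
entropy_first = von_neumann_entropy(rho_first)  # maximally mixed: ln 2
print(entropy_total, entropy_first)
```

In this toy case the composite state has essentially zero entropy while each component alone has entropy ln 2, which is the qualitative pattern the abstract attributes to texts and their words.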

A Planck Radiation and Quantization Scheme for Human Cognition and Language

Diederik Aerts , Lester Beltran

Category: Natural Language Processing

2022-01-10

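Following the identification of 'identity' and 'indistinguishability', and strong experimental evidence for the presence of the associated Bose-Einstein statistics in human cognition and language, we argued in previous work for an extension of the research domain of quantum cognition. In addition to quantum complex vector spaces and quantum probability models, we showed that quantization itself, with words as quanta, is relevant and potentially important to human cognition. In the present work, we build on this result and introduce a powerful radiation quantization scheme for human cognition. We show that the lack of independence of Bose-Einstein statistics, compared with Maxwell-Boltzmann statistics, can be explained by the presence of a 'meaning dynamics', which causes words to be attracted to the same words. Words therefore clump together in the same state, a phenomenon well known for photons in the early years of quantum mechanics, which led to fierce disagreements between Planck and Einstein. Using a simple example, we introduce all the elements needed for a better and more detailed view of this 'meaning dynamics', such as micro- and macrostates and the Maxwell-Boltzmann, Bose-Einstein and Fermi-Dirac numbers and weights, and we compare this example and its graphs with the radiation quantization scheme of a Winnie the Pooh story, also with its graphs. By connecting a concept directly to human experience, we show that entanglement is necessary for preserving the 'meaning dynamics' we identified, and it becomes clear in what way Fermi-Dirac statistics addresses human memory. There, in spaces with internal parameters, different words can be assigned.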
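
The micro- and macrostate counting mentioned in this abstract can be sketched with a small enumeration. The example below is a toy case chosen for illustration (two particles over two states), not the example used in the paper, and the function names are hypothetical. It counts Maxwell-Boltzmann, Bose-Einstein and Fermi-Dirac configurations and compares how likely it is that all particles share one state when every labelled assignment is equally weighted (MB) versus when every occupation pattern is equally weighted (BE).

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def occupation_macrostates(n_particles, n_states):
    """Map each occupation-number tuple (macrostate) to the number of
    labelled assignments (microstates) that produce it."""
    counts = Counter()
    for assignment in product(range(n_states), repeat=n_particles):
        occupation = tuple(assignment.count(s) for s in range(n_states))
        counts[occupation] += 1
    return counts


def same_state_probabilities(n_particles, n_states):
    """Probability that all particles share one state under MB and BE weights."""
    macro = occupation_macrostates(n_particles, n_states)
    clumped = [occ for occ in macro if max(occ) == n_particles]
    # Maxwell-Boltzmann: every labelled assignment is equally likely.
    p_mb = Fraction(sum(macro[occ] for occ in clumped), sum(macro.values()))
    # Bose-Einstein: every occupation pattern is equally likely.
    p_be = Fraction(len(clumped), len(macro))
    return p_mb, p_be


n, m = 2, 2  # assumed toy values: two "words" over two "energy levels"
macro = occupation_macrostates(n, m)
mb_count = sum(macro.values())                       # distinguishable particles
be_count = len(macro)                                # indistinguishable particles
fd_count = sum(1 for occ in macro if max(occ) <= 1)  # at most one per state

# Cross-check the enumeration against the closed-form counts.
assert mb_count == m ** n
assert be_count == comb(n + m - 1, n)
assert fd_count == comb(m, n)

p_mb, p_be = same_state_probabilities(n, m)
print(mb_count, be_count, fd_count, p_mb, p_be)
```

With these toy values the same-state probability is 1/2 under Maxwell-Boltzmann weighting and 2/3 under Bose-Einstein weighting, a small instance of the clumping that the abstract attributes to the lack of independence in Bose-Einstein statistics.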

Are Words the Quanta of Human Language? Extending the Domain of Quantum Cognition

Diederik Aerts , Lester Beltran

Category: Natural Language Processing

2021-10-10

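In previous research, we showed that 'texts that tell a story' exhibit a statistical structure that is not Maxwell-Boltzmann but Bose-Einstein. Our explanation is that this is due to the presence of 'indistinguishability' in human language, in that the same words in different parts of a story are indistinguishable from one another. In the present article, we set out to provide an explanation for this Bose-Einstein statistics. We show that it is the presence of 'meaning' in 'texts that tell a story' that gives rise to the lack of independence characteristic of Bose-Einstein statistics, and that this provides conclusive evidence that 'words can be considered the quanta of human language', structurally similar to how 'photons are the quanta of light'. Using several studies on entanglement from our Brussels research group, we also show that it is likewise the presence of 'meaning' in a text that makes the von Neumann entropy of the total text smaller than the entropy of the words composing it. We explain how the new insights in this article fit in with the research domain called 'quantum cognition', where quantum probability models and quantum vector spaces are used in human cognition, how they are also relevant to the use of quantum structures in information retrieval and natural language processing, and how they introduce 'quantization' and 'Bose-Einstein statistics' as relevant quantum effects there. Inspired by the conceptuality interpretation of quantum mechanics, and relying on the new insights, we put forward hypotheses about the nature of physical reality. In doing so, we note that this new type of decrease in entropy, and its explanation, may be important for the development of quantum thermodynamics. We likewise note that it can give rise to an original explanatory picture of the nature of physical reality on the surface of planet Earth, in which human culture emerges as a reinforcing continuation of life.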

Multi hash embeddings in spaCy

Lester James Miranda , Ákos Kádár , Adriane Boyd , Sofie Van Landeghem , Anders Søgaard , Matthew Honnibal

Category: Natural Language Processing

2022-12-19

The distributed representation of symbols is one of the key technologies in machine learning systems today, playing a pivotal role in modern natural language processing. Traditional word embeddings associate a separate vector with each word. While this approach is simple and leads to good performance, it requires a lot of memory for representing a large vocabulary. To reduce the memory footprint, the default embedding layer in spaCy is a hash embeddings layer. It is a stochastic approximation of traditional embeddings that provides unique vectors for a large number of words without explicitly storing a separate vector for each of them. To be able to compute meaningful representations for both known and unknown words, hash embeddings represent each word as a summary of the normalized word form, subword information and word shape. Together, these features produce a multi-embedding of a word. In this technical report we lay out a bit of history and introduce the embedding methods in spaCy in detail. Second, we critically evaluate the hash embedding architecture with multi-embeddings on Named Entity Recognition datasets from a variety of domains and languages. The experiments validate most key design choices behind spaCy's embedders, but we also uncover a few surprising results.
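
The idea of a hash embedding described in this abstract can be sketched in a few lines. The code below is a minimal NumPy illustration under stated assumptions and is not spaCy's implementation: the class `HashEmbedding`, the helpers `word_shape` and `multi_embed`, the MD5-based hashing, the table sizes, and the exact feature definitions are all hypothetical. Each feature string is hashed several times into a small shared table and the selected rows are summed, so distinct strings rarely receive identical vectors even though no per-word vector is stored.

```python
import hashlib

import numpy as np


class HashEmbedding:
    """Fixed-size table addressed by several hashes of the input string."""

    def __init__(self, n_rows=1000, width=8, n_hashes=4, seed=0):
        rng = np.random.default_rng(seed)
        # Random values stand in for weights that would normally be learned.
        self.table = rng.normal(scale=0.1, size=(n_rows, width))
        self.n_rows = n_rows
        self.n_hashes = n_hashes

    def _rows(self, key):
        rows = []
        for i in range(self.n_hashes):
            digest = hashlib.md5(f"{i}:{key}".encode("utf-8")).digest()
            rows.append(int.from_bytes(digest[:8], "little") % self.n_rows)
        return rows

    def __call__(self, key):
        # Sum the selected rows; two keys collide fully only if all hashes agree.
        return self.table[self._rows(key)].sum(axis=0)


def word_shape(word):
    """Coarse orthographic shape: 'Brussels' -> 'Xxxxxxxx', '2022' -> 'dddd'."""
    out = []
    for ch in word:
        if ch.isdigit():
            out.append("d")
        elif ch.isalpha():
            out.append("X" if ch.isupper() else "x")
        else:
            out.append(ch)
    return "".join(out)


def multi_embed(word, embedders):
    """Concatenate hash embeddings of several features of one word."""
    features = {
        "norm": word.lower(),    # normalized form
        "prefix": word[:1],      # subword information
        "suffix": word[-3:],     # subword information
        "shape": word_shape(word),
    }
    return np.concatenate([embedders[name](value) for name, value in features.items()])


feature_names = ["norm", "prefix", "suffix", "shape"]
embedders = {name: HashEmbedding(seed=i) for i, name in enumerate(feature_names)}
vec = multi_embed("Brussels", embedders)
print(vec.shape)
```

Because the vector is computed from hashes of the word's features rather than looked up by vocabulary index, an unseen word still receives a representation, and it shares the suffix and shape components with known words that look similar.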

Controlling Moments with Kernel Stein Discrepancies

Heishiro Kanagawa , Arthur Gretton , Lester Mackey

Category: Machine Learning (Statistics) | Machine Learning

2022-11-10

Quantifying the deviation of a probability distribution is challenging when the target distribution is defined by a density with an intractable normalizing constant. The kernel Stein discrepancy (KSD) was proposed to address this problem and has been applied to various tasks including diagnosing approximate MCMC samplers and goodness-of-fit testing for unnormalized statistical models. This article investigates a convergence control property of the diffusion kernel Stein discrepancy (DKSD), an instance of the KSD proposed by Barp et al. (2019). We extend the result of Gorham and Mackey (2017), which showed that the KSD controls the bounded-Lipschitz metric, to functions of polynomial growth. Specifically, we prove that the DKSD controls the integral probability metric defined by a class of pseudo-Lipschitz functions, a polynomial generalization of Lipschitz functions. We also provide practical sufficient conditions on the reproducing kernel for the stated property to hold. In particular, we show that the DKSD detects non-convergence in moments with an appropriate kernel.

Targeted Separation and Convergence with Kernel Discrepancies

Alessandro Barp , Carl-Johann Simon-Gabriel , Mark Girolami , Lester Mackey

Category: Machine Learning (Statistics) | Machine Learning

2022-09-26

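Maximum mean discrepancies (MMDs), such as the kernel Stein discrepancy (KSD), have become central to a wide range of applications, including hypothesis testing, sampler selection, distribution approximation, and variational inference. In each setting, these kernel-based discrepancy measures are required to (i) separate a target P from other probability measures, or even (ii) control weak convergence to P. In this article we derive new sufficient and necessary conditions to ensure (i) and (ii). For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels and for controlling convergence with bounded kernels. We use these results on $\mathbb{R}^d$ to substantially broaden the known conditions for KSD separation and convergence control and to develop the first KSDs known to exactly metrize weak convergence to P. We also highlight the implications of our results for hypothesis testing, measuring and improving sample quality, and sampling with Stein variational gradient descent.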
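
For readers unfamiliar with the quantity discussed in this abstract, the following is a minimal sketch of a sample-based MMD estimate. It assumes a Gaussian kernel with a fixed bandwidth and the biased (V-statistic) estimator; these choices, the function names, and the synthetic Gaussian samples are illustrative and are not the kernels or conditions studied in the paper.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 h^2))."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


rng = np.random.default_rng(0)
x = rng.normal(size=(300, 2))                # sample from the target
y_near = rng.normal(size=(300, 2))           # second sample, same distribution
y_far = rng.normal(loc=2.0, size=(300, 2))   # sample from a shifted distribution

mmd_near = mmd_squared(x, y_near)
mmd_far = mmd_squared(x, y_far)
print(mmd_near, mmd_far)
```

The estimate is close to zero for two samples from the same distribution and clearly larger for the shifted sample, which is the separation behaviour, property (i), in its simplest form. Whether a given kernel also controls weak convergence, property (ii), is the question the paper addresses.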

Adaptive Bias Correction for Improved Subseasonal Forecasting

Soukayna Mouatadid , Paulo Orenstein , Genevieve Flaspohler , Judah Cohen , Miruna Oprescu , Ernest Fraenkel , Lester Mackey

Category: Machine Learning | Machine Learning (Statistics)

2022-09-21

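Subseasonal forecasting, that is, predicting temperature and precipitation 2 to 6 weeks ahead, is critical for effective water allocation, wildfire management, and drought and flood mitigation. Recent international research efforts have advanced the subseasonal capabilities of operational dynamical models, yet temperature and precipitation prediction skill remains poor, partly due to stubborn errors in representing atmospheric dynamics and physics inside dynamical models. To counter these errors, we introduce an adaptive bias correction (ABC) method that combines state-of-the-art dynamical forecasts with observations using machine learning. When applied to the leading subseasonal model from the European Centre for Medium-Range Weather Forecasts (ECMWF), ABC improves temperature forecasting skill by 60-90% and precipitation forecasting skill by 40-69% in the contiguous U.S. We pair these performance improvements with a practical workflow, based on Cohort Shapley, for explaining ABC skill gains and identifying higher-skill windows of opportunity based on specific climate conditions.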

Rethinking Cost-sensitive Classification in Deep Learning via Adversarial Data Augmentation

Qiyuan Chen , Raed Al Kontar , Maher Nouiehed , Jessie Yang , Corey Lester

Category: Machine Learning | Machine Learning (Statistics)

2022-08-24

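Cost-sensitive classification is critical in applications where misclassification errors vary widely in cost. However, over-parameterization poses fundamental challenges to the cost-sensitive modeling of deep neural networks (DNNs). The ability of a DNN to fully interpolate a training dataset can render a DNN, evaluated purely on the training set, ineffective in distinguishing a cost-sensitive solution from its overall-accuracy-maximizing counterpart. This necessitates rethinking cost-sensitive classification in DNNs. To address this challenge, this paper proposes a cost-sensitive adversarial data augmentation (CSADA) framework to make over-parameterized models cost-sensitive. The overarching idea is to generate targeted adversarial examples that push the decision boundary in cost-aware directions. These targeted adversarial samples are generated by maximizing the probability of critical misclassifications and are used to train a model that makes more conservative decisions on costly pairs. Experiments on well-known datasets and a publicly available pharmacy medication image (PMI) dataset show that our method can effectively minimize the overall cost and reduce critical errors, while achieving comparable performance in terms of overall accuracy.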

Reducing Retraining by Recycling Parameter-Efficient Prompts

Brian Lester , Joshua Yurtsever , Siamak Shakeri , Noah Constant

Category: Natural Language Processing

2022-08-10

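Parameter-efficient methods are able to use a single frozen pre-trained large language model (LLM) to perform many tasks by learning task-specific soft prompts that modulate model behavior when concatenated to the input text. However, these learned prompts are tightly coupled to a given frozen model: if the model is updated, corresponding new prompts need to be obtained. In this work, we propose and investigate several approaches to "Prompt Recycling", in which a prompt trained on a source model is transformed to work with the new target model. Our methods do not rely on supervised pairs of prompts, task-specific data, or training updates with the target model, which would be just as costly as re-tuning prompts with the target model from scratch. We show that recycling between models is possible (our best settings are able to successfully recycle 88.9% of prompts, producing a prompt that outperforms baselines), but significant performance headroom remains, requiring improved recycling techniques.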

An Ontological Approach to Analysing Social Service Provisioning

Mark S. Fox , Bart Gajderowicz , Daniela Rosu , Alina Turner , Lester Lyu

Category: Artificial Intelligence

2022-06-20

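This paper introduces the ontological concepts required to evaluate and manage the coverage of social services in a Smart City context. Here, we focus on the perspective of key stakeholders, namely social purpose organizations and the clients they serve. The Compass ontology presented here extends the Common Impact Data Standard by introducing new concepts related to key dimensions: the who (Stakeholder), the what (Need, Need Satisfier, Outcome), the how (Service, Event), and the contributions (tracking resources). The paper first introduces key stakeholders, services, outcomes, events, needs and need satisfiers, along with their definitions. Second, a subset of competency questions is presented to illustrate the types of questions key stakeholders have posed. Third, the extension's ability to answer these questions is evaluated by presenting SPARQL queries executed on a Compass-based knowledge graph and analysing their results.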
